Collaborative Entity Extraction and Translation

Authors

  • Heng Ji
  • Ralph Grishman
Abstract

Entity extraction is the task of identifying names and nominal phrases (‘mentions’) in a text and linking coreferring mentions. We propose a new source of data for improving entity extraction: the information gleaned from large bitexts and captured by a statistical, phrase-based machine translation system. We translate the individual mentions, test properties of the translated mentions, and compare the translations of coreferring mentions. The results provide feedback to improve source-language entity extraction. Experiments on Chinese and English show that this approach can significantly improve Chinese entity extraction (2.2% relative improvement in name tagging F-measure, representing a 15.0% error reduction), as well as Chinese-to-English entity translation (9.1% relative improvement in F-measure), over state-of-the-art entity extraction and machine translation systems.
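The two name tagging figures in the abstract (a 2.2% relative F-measure improvement and a 15.0% error reduction) can be related by a short calculation. The sketch below assumes that "error" means 1 − F, which the abstract does not state, and the function names are hypothetical. The baseline it yields is only what the two rounded percentages imply under that assumption, not a figure reported on this page.

```python
def implied_baseline_f(relative_gain: float, error_reduction: float) -> float:
    """Baseline F implied by a relative gain and an error reduction.

    Assumption (not stated in the abstract): error = 1 - F, so the absolute
    gain satisfies relative_gain * F = error_reduction * (1 - F).
    """
    return error_reduction / (relative_gain + error_reduction)


def improved_f(baseline: float, relative_gain: float) -> float:
    """F-measure after applying a relative improvement to the baseline."""
    return baseline * (1.0 + relative_gain)


# Figures quoted in the abstract: 2.2% relative gain, 15.0% error reduction.
baseline = implied_baseline_f(0.022, 0.150)
improved = improved_f(baseline, 0.022)

print(f"implied baseline F: {baseline:.3f}")
print(f"implied improved F: {improved:.3f}")
```

Under this assumption the two percentages are mutually consistent for a baseline F-measure of roughly 0.87, rising to roughly 0.89. A small relative gain corresponds to a much larger error reduction because the remaining error at that level is already small.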

Similar resources

Person Name Recognition Using Injection of Name-Candidate Words into Conditional Random Fields for Arabic

Named entity recognition and extraction are very important tasks for discovering proper names, including persons, locations, dates, and times, inside electronic textual resources. An accurate named entity recognition system is an essential utility for resolving fundamental problems in question answering systems, summary extraction, information retrieval and extraction, machine translation, video interpr...

Presenting a method for extracting structured domain-dependent information from Farsi Web pages

Extracting structured information about entities from web texts is an important task in web mining, natural language processing, and information extraction. Information extraction is useful in many applications including search engines, question-answering systems, recommender systems, machine translation, etc. An information extraction system aims to identify the entities from the text and extr...

The Effects of Collaborative Translation Task on the Apology Speech Act Production of Iranian EFL Learners

The present study aims to investigate the relative effectiveness of different types of pragmatic instruction including two collaborative translation tasks and two structured input tasks with and without explicit pragmatic instruction on the production of apologetic utterances by low-intermediate EFL learners. One hundred and fifty university students in four experimental groups and one control ...

Development of EFL Teachers’ Engagement and Professional Identity: The Effect of Discussing Teacher Competences via E-Collaborative Discussion Forum

This mixed-methods study investigated the effect of an electronic collaborative discussion forum on Iranian EFL teachers' engagement and professional identity and their development in terms of teachers' competences as they were engaged in collaborative teacher inquiry. For this purpose, 5 EFL teachers participated in 11 online forum discussion sessions. Before participating in di...

Improving Persian Named Entity Recognition Using the Ezafe Marker

Named entity recognition is a process in which the names of people, places (cities, countries, seas, etc.), and organizations (public and private companies, international institutions, etc.), as well as dates, currencies, and percentages in a text are identified. Named entity recognition plays an important role in many NLP tasks such as semantic role labeling, question answering, summarization, machine ...

Publication date: 2007